Fast Constrained Submodular Maximization: Personalized Data Summarization
نویسندگان
چکیده
Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first practical and FAst coNsTrained submOdular Maximization algorithm, FANTOM, with strong theoretical guarantees. FANTOM maximizes a submodular function (not necessarily monotone) subject to the intersection of a p-system and l knapsacks constrains. It achieves a (1+ )(p+1)(2p+2l+1)/p approximation guarantee with onlyO( log(n) ) query complexity (n and r indicate the size of the ground set and the size of the largest feasible solution, respectively). We then show how we can use FANTOM for personalized data summarization. In particular, a p-system can model different aspects of data, such as categories or time stamps, from which the users choose. In addition, knapsacks encode users’ constraints including budget or time. In our set of experiments, we consider several concrete applications: movie recommendation over 11K movies, personalized image summarization with 10K images, and revenue maximization on the YouTube social networks with 5000 communities. We observe that FANTOM constantly provides the highest utility against all the baselines. Proceedings of the 33 rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 48. Copyright 2016 by the author(s).
منابع مشابه
Subtree Extractive Summarization via Submodular Maximization
This study proposes a text summarization model that simultaneously performs sentence extraction and compression. We translate the text summarization task into a problem of extracting a set of dependency subtrees in the document cluster. We also encode obligatory case constraints as must-link dependency constraints in order to guarantee the readability of the generated summary. In order to handl...
متن کاملLearning Sparse Combinatorial Representations via Two-stage Submodular Maximization
We consider the problem of learning sparse representations of data sets, where the goal is to reduce a data set in manner that optimizes multiple objectives. Motivated by applications of data summarization, we develop a new model which we refer to as the two-stage submodular maximization problem. This task can be viewed as a combinatorial analogue of representation learning problems such as dic...
متن کاملThe Power of Randomization: Distributed Submodular Maximization on Massive Datasets
A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. Unfortunately, the resulting submodular optimization problems are often too large to be solved on a single machine. We develop a simple distributed algorithm that is embarrassingly parallel and it achieves prova...
متن کاملKnapsack Constrained Contextual Submodular List Prediction with Application to Multi-document Summarization
We study the problem of predicting a set or list of options under knapsack constraint. The quality of such lists are evaluated by a submodular reward function that measures both quality and diversity. Similar to DAgger (Ross et al., 2010), by a reduction to online learning, we show how to adapt two sequence prediction models to imitate greedy maximization under knapsack constraint problems: CON...
متن کاملMulti-document Summarization via Budgeted Maximization of Submodular Functions
We treat the text summarization problem as maximizing a submodular function under a budget constraint. We show, both theoretically and empirically, a modified greedy algorithm can efficiently solve the budgeted submodular maximization problem near-optimally, and we derive new approximation bounds in doing so. Experiments on DUC’04 task show that our approach is superior to the bestperforming me...
متن کامل